111 research outputs found

TICO: a tool for postprocessing the predictions of prokaryotic translation initiation sites

Author: Meinicke P.
Morgenstern B.
Tech M.
Publication venue: Oxford University Press
Publication date: 14/07/2006

Exact localization of the translation initiation sites (TIS) in prokaryotic genomes is difficult to achieve using conventional gene finders. We recently introduced the program TICO for postprocessing TIS predictions based on a completely unsupervised learning algorithm. The program can be utilized through our web interface at and it is also freely available as a command-line version for Linux and Windows. The latest version of our program provides a tool for visualization of the resulting TIS model. Although the underlying method is not based on any specific assumptions about characteristic sequence features of prokaryotic TIS, the prediction rates of our tool are competitive on experimentally verified test data.

Crossref

PubMed Central

Refinement algebra for probabilistic programs

Author: A McIver
BA Davey
C Morgan
D Kozen
EW Dijkstra
J Desharnais
J von Wright
Kim Solin
L Lamport
L Meinicke
LA Meinicke
Larissa Meinicke
P Höfner
RJ Lipton
RJR Back
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/04/2009

We identify a refinement algebra for reasoning about probabilistic program transformations in a total-correctness setting. The algebra is equipped with operators that determine whether a program is enabled or terminates, respectively. As well as developing the basic theory of the algebra, we demonstrate how it may be used to explain key differences and similarities between standard (i.e. non-probabilistic) and probabilistic programs and verify important transformation theorems for probabilistic action systems.

Crossref

Macquarie University ResearchOnline

University of Queensland eSpace

Accurate Profiling of Microbial Communities from Massively Parallel Sequencing using Convex Optimization

Author: A. Amir
B.J. Paster
C. Lozupone
C.A. Lozupone
D. Hiller
D. Kessner
D.H. Haft
E.R. Mardis
I. Eskin
J.R. Cole
M. Grant
M. Hamady
N. Segata
P. Meinicke
P.B. Eckburg
S. Pavoine
T.J. Gentry
T.Z. DeSantis
Publication date: 01/01/2013

We describe the Microbial Community Reconstruction (MCR) Problem, which is fundamental for microbiome analysis. In this problem, the goal is to reconstruct the identity and frequency of species comprising a microbial community, using short sequence reads from Massively Parallel Sequencing (MPS) data obtained for specified genomic regions. We formulate the problem mathematically as a convex optimization problem and provide sufficient conditions for identifiability, namely the ability to reconstruct species identity and frequency correctly when the data size (number of reads) grows to infinity. We discuss different metrics for assessing the quality of the reconstructed solution, including a novel phylogenetically-aware metric based on the Mahalanobis distance, and give upper bounds on the reconstruction error for a finite number of reads under different metrics. We propose a scalable divide-and-conquer algorithm for the problem using convex optimization, which enables us to handle large problems (with $\sim 10^6$ species). We show using numerical simulations that for realistic scenarios, where the microbial communities are sparse, our algorithm gives solutions with high accuracy, both in terms of obtaining accurate frequency, and in terms of species phylogenetic resolution.

Comment: To appear in SPIRE 1

arXiv.org e-Print Archive

CiteSeerX

Crossref
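
To make the convex formulation in the abstract above more concrete, here is a minimal sketch, not the authors' algorithm. It assumes a least-squares objective over the probability simplex and solves it by projected gradient descent. The matrix `A`, the vector `y`, the function names, and the step size are all hypothetical choices for illustration; the divide-and-conquer scheme mentioned in the abstract is not shown.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:
            theta = t  # keep the threshold of the last index that satisfies the condition
    return [max(x - theta, 0.0) for x in v]


def estimate_frequencies(A, y, steps=2000, lr=0.1):
    """Minimize ||A x - y||^2 over the simplex by projected gradient descent.

    A[k][j]: assumed frequency of read type k in species j (hypothetical input).
    y[k]:    observed frequency of read type k in the sample.
    """
    n_reads, n_species = len(A), len(A[0])
    x = [1.0 / n_species] * n_species  # start from the uniform mixture
    for _ in range(steps):
        residual = [
            sum(A[k][j] * x[j] for j in range(n_species)) - y[k]
            for k in range(n_reads)
        ]
        grad = [
            2.0 * sum(A[k][j] * residual[k] for k in range(n_reads))
            for j in range(n_species)
        ]
        x = project_to_simplex([x[j] - lr * grad[j] for j in range(n_species)])
    return x


if __name__ == "__main__":
    A = [[0.7, 0.1, 0.2], [0.2, 0.8, 0.1], [0.1, 0.1, 0.7]]
    y = [0.4, 0.5, 0.1]  # generated from the mixture (0.5, 0.5, 0.0)
    print(estimate_frequencies(A, y))
```

The simplex constraint keeps the estimate a valid frequency vector and tends to push absent species toward zero, which fits the sparse communities the abstract describes. The fixed step size is only suitable for small, well-scaled toy inputs like the one shown.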

Principal surfaces from unsupervised kernel regression

Author: H. Ritter
P. Meinicke
R. Memisevic
S. Klanke
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

DIALIGN-TX and multiple protein alignment using secondary structure information at GOBICS

Author: A. R. Subramanian
B. Morgenstern
Brudno
Do
E. Corel
Edgar
Feng
Heringa
Lenhof
Montgomerie
Morgenstern
P. Meinicke
Pohler
R. Steinkamp
S. Hiran
Subramanian
Taylor
Thompson
Wong
Publication venue: Oxford University Press
Publication date: 01/01/2010

We introduce web interfaces for two recent extensions of the multiple-alignment program DIALIGN. DIALIGN-TX combines the greedy heuristic previously used in DIALIGN with a more traditional ‘progressive’ approach for improved performance on locally and globally related sequence sets. In addition, we offer a version of DIALIGN that uses predicted protein secondary structures together with primary sequence information to construct multiple protein alignments. Both programs are available through the ‘Göttingen Bioinformatics Compute Server’ (GOBICS).

CiteSeerX

Crossref

PubMed Central

Streaming fragment assignment for real-time analysis of sequencing experiments

Author: A Ahmadi
A Roberts
Adam Roberts
B Langmead
B Li
B Wold
C Trapnell
D Branton
D Chung
D Lipman
KD Hansen
LD Stein
Lior Pachter
M Taub
O Cappé
P Meinicke
S Anders
S Lee
T Hashimoto
Publication venue: Nature Publishing Group
Publication date: 01/01/2013

We present eXpress, a software package for efficient probabilistic assignment of ambiguously mapping sequenced fragments. eXpress uses a streaming algorithm with linear run time and constant memory use. It can determine abundances of sequenced molecules in real time and can be applied to ChIP-seq, metagenomics and other large-scale sequencing data. We demonstrate its use on RNA-seq data and show that eXpress achieves greater efficiency than other quantification methods.

Crossref

PubMed Central

Caltech Authors
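
The idea of streaming, probabilistic fragment assignment described in the eXpress abstract above can be illustrated with a heavily simplified sketch. This is not the eXpress algorithm: it assumes each fragment arrives with a list of compatible targets and splits one unit of mass among them in proportion to the current abundance estimates. The function name, the `prior` pseudo-count, and the input format are hypothetical.

```python
def stream_assign(fragments, targets, prior=1.0):
    """One-pass soft assignment of ambiguously mapping fragments.

    fragments: iterable of lists, each holding the target names a fragment maps to.
    targets:   all target names.
    Returns a dict of estimated relative abundances that sums to 1.
    """
    mass = {t: prior for t in targets}  # running pseudo-counts per target
    for compatible in fragments:
        total = sum(mass[t] for t in compatible)
        # Compute all weights before updating so the split uses a consistent state.
        weights = {t: mass[t] / total for t in compatible}
        for t, w in weights.items():
            mass[t] += w
    norm = sum(mass.values())
    return {t: m / norm for t, m in mass.items()}


if __name__ == "__main__":
    reads = [["A"], ["A", "B"], ["A"], ["A", "B"]]
    print(stream_assign(reads, ["A", "B"]))
```

Each fragment is processed once and then discarded, so run time grows linearly with the number of alignments and the stored state depends only on the number of targets, which mirrors the linear-time, constant-memory property the abstract claims.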

Joint Analysis of Multiple Metagenomic Samples

Author: A Kislyuk
B Yang
C Chan
Chris P. Ponting
D Cohn
D Huson
D Lee
D Richter
D Rusch
Eran Halperin
H Leung
H Teeling
I Jolliffe
J Hartigan
J Qin
J Sivic
M Arumugam
M Chiang
M Hamady
M Takahashi
M Wendl
P Meinicke
P Turnbaugh
S Chatterji
S Karlin
T Brants
T Hofmann
W Kent
X Jiang
Y Wu
Yael Baran
Publication venue: Public Library of Science
Publication date: 16/02/2012

The availability of metagenomic sequencing data, generated by sequencing DNA pooled from multiple microbes living jointly, has increased sharply in the last few years with developments in sequencing technology. Characterizing the contents of metagenomic samples is a challenging task, which has been extensively attempted by both supervised and unsupervised techniques, each with its own limitations. Common to practically all the methods is the processing of single samples only; when multiple samples are sequenced, each is analyzed separately and the results are combined. In this paper we propose to perform a combined analysis of a set of samples in order to obtain a better characterization of each of the samples, and provide two applications of this principle. First, we use an unsupervised probabilistic mixture model to infer hidden components shared across metagenomic samples. We incorporate the model in a novel framework for studying association of microbial sequence elements with phenotypes, analogous to the genome-wide association studies performed on human genomes: we demonstrate that stratification may result in false discoveries of such associations, and that the components inferred by the model can be used to correct for this stratification. Second, we propose a novel read clustering (also termed “binning”) algorithm which operates on multiple samples simultaneously, leveraging the assumption that the different samples contain the same microbial species, possibly in different proportions. We show that integrating information across multiple samples yields more precise binning on each of the samples. Moreover, for both applications we demonstrate that given a fixed depth of coverage, the average per-sample performance generally increases with the number of sequenced samples as long as the per-sample coverage is high enough.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Learning a peptide-protein binding affinity predictor with kernel ridge regression

Author: A Dömling
AJ Bordner
AJ Smola
Alexandre Drouin
AR Ortiz
B Hoffmann
B Peters
B Schölkopf
C Rasmussen
CS Leslie
Dana-Farber Cancer Institute
François Laviolette
G Rätsch
H Saigo
J Qiu
J Robinson
J Shawe-Taylor
J Swets
J Wells
Jacques Corbeil
JL Faulon
JM Perez-De-Vega
L Costantino
L Jacob
L Zhang
M Hue
M Nielsen
M Takarabe
Mario Marchand
N Nagamine
N Toussaint
P Meinicke
P Vanhee
P Zhou
PL Toogood
R Albert
Sébastien Giguère
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 31/07/2012

We propose a specialized string kernel for small bio-molecules, peptides and pseudo-sequences of binding interfaces. The kernel incorporates physico-chemical properties of amino acids and elegantly generalizes eight kernels, such as the Oligo, the Weighted Degree, the Blended Spectrum, and the Radial Basis Function. We provide a low complexity dynamic programming algorithm for the exact computation of the kernel and a linear time algorithm for its approximation. Combined with kernel ridge regression and SupCK, a novel binding pocket kernel, the proposed kernel yields biologically relevant and good prediction accuracy on the PepX database. For the first time, a machine learning predictor is capable of accurately predicting the binding affinity of any peptide to any protein. The method was also applied to both single-target and pan-specific Major Histocompatibility Complex class II benchmark datasets and three Quantitative Structure Affinity Model benchmark datasets. On all benchmarks, our method significantly (p-value < 0.057) outperforms the current state-of-the-art methods at predicting peptide-protein binding affinities. The proposed approach is flexible and can be applied to predict any quantitative biological activity. The method should be of value to a large segment of the research community with the potential to accelerate peptide-based drug and vaccine development.

Comment: 22 pages, 4 figures, 5 tables

arXiv.org e-Print Archive

Crossref

Springer - Publisher Connector
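
As an illustration of how kernel ridge regression combines with a string kernel, as outlined in the abstract above, here is a minimal sketch. It substitutes a plain k-mer spectrum kernel for the specialized kernel the authors propose, so it shows the general technique only. The function names, the regularization value `lam`, and the toy peptides and affinities are hypothetical.

```python
from collections import Counter


def spectrum_kernel(s, t, k=2):
    """Count shared k-mers between two sequences (a simple stand-in string kernel)."""
    counts_s = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    counts_t = Counter(t[i:i + k] for i in range(len(t) - k + 1))
    return float(sum(c * counts_t[kmer] for kmer, c in counts_s.items()))


def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_krr(peptides, affinities, lam=0.1, k=2):
    """Kernel ridge regression: solve (K + lam * I) alpha = y, return a predictor."""
    K = [[spectrum_kernel(a, b, k) for b in peptides] for a in peptides]
    for i in range(len(K)):
        K[i][i] += lam  # ridge term on the diagonal
    alpha = solve(K, list(affinities))
    return lambda s: sum(a * spectrum_kernel(p, s, k) for a, p in zip(alpha, peptides))


if __name__ == "__main__":
    predict = fit_krr(["AAKK", "KKLL", "LLAA"], [1.0, 2.0, 3.0])
    print(predict("AAKL"))
```

A larger `lam` shrinks the fitted function toward zero and trades training fit for smoothness. A kernel that also encodes physico-chemical similarity between amino acids, as the abstract describes, would replace `spectrum_kernel` without changing the regression step.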

Identification and analysis of methylation call differences between bisulfite microarray and bisulfite sequencing data with statistical learning techniques

Author: AE Teschendorff
G Rätsch
Gilles Gasparoni
Jasmin Gries
Jörn Walter
Karl Nordström
Matthias Döring
Nico Pfeifer
P Meinicke
Pavlo Lutsik
S Dedeurwaerder
S Sonnenburg
T Gärtner
Y Liu
Publication venue: 'Springer Science and Business Media LLC'

Crossref